Add option to expand ensembles #206

Merged
mattiarighi merged 7 commits into development from ensemble_expansion on Oct 2, 2019

Conversation

@jvegreg jvegreg commented Aug 19, 2019

Contributor

In CMIP6, some models provide 30 ensemble members or more. Listing them all in a recipe is currently a nightmare of copy-pasting and is very error-prone.

The currently implemented syntax is the following:

  - {<<: *cmip, dataset: IPSL-CM6A-LR, ensemble: 'r[1:30]i1p1f1'}

This will add all datasets from r1i1p1f1 to r30i1p1f1, making dealing with big ensembles easier.

Any suggestions on the syntax are welcome. The current code only supports expanding one of the numbers.
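
As a rough illustration of the expansion described above (not the code merged in this pull request), the sketch below expands a single '[start:end]' range in an ensemble string. The function name expand_ensemble and the regular expression are assumptions made for this example only.

```python
import re

# Hypothetical helper for illustration; not the ESMValCore implementation.
_RANGE = re.compile(r"\[(\d+):(\d+)\]")


def expand_ensemble(ensemble):
    """Expand one '[start:end]' range into a list of ensemble members."""
    matches = _RANGE.findall(ensemble)
    if not matches:
        return [ensemble]  # nothing to expand
    if len(matches) > 1:
        # Mirrors the stated limitation: only one number can be expanded.
        raise ValueError(f"Only one range is supported: {ensemble}")
    start, end = (int(value) for value in matches[0])
    if start > end:
        raise ValueError(f"Empty range in: {ensemble}")
    # Both bounds are inclusive, so r[1:30] yields r1 ... r30.
    return [_RANGE.sub(str(i), ensemble) for i in range(start, end + 1)]


members = expand_ensemble("r[1:30]i1p1f1")
print(members[0], members[-1], len(members))  # r1i1p1f1 r30i1p1f1 30
```

An ensemble string without a range is returned unchanged, so ordinary dataset entries would not be affected by such a helper.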

@bouweandela

Member

Any suggestions on the syntax are welcome. The current code only supports expanding one of the numbers.

Nice addition. I think that normal glob rules provided by fnmatch would be easiest to understand for most users, i.e. '*', '?', '[0-9]', etc. It would be nice to have these available not just for ensemble members, but also for dataset name, start_year, end_year, ...
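
For readers unfamiliar with these rules, here is a small sketch of how Python's standard fnmatch module treats such patterns. The list of available members is made up for illustration, and nothing here reflects how ESMValCore looks up data.

```python
import fnmatch

# Made-up list of members found on disk, for illustration only.
available = ["r1i1p1f1", "r2i1p1f1", "r10i1p1f1", "r1i1p2f1"]

# '?' matches exactly one character, so only single-digit realizations match.
single_digit = fnmatch.filter(available, "r?i1p1f1")
print(single_digit)  # ['r1i1p1f1', 'r2i1p1f1']

# '*' matches any run of characters, including none.
any_realization = fnmatch.filter(available, "r*i1p1f1")
print(any_realization)  # ['r1i1p1f1', 'r2i1p1f1', 'r10i1p1f1']

# '[0-9]' matches one digit; here it selects any single-digit physics index.
any_physics = fnmatch.filter(available, "r1i1p[0-9]f1")
print(any_physics)  # ['r1i1p1f1', 'r1i1p2f1']
```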

jvegreg commented Aug 29, 2019

Contributor Author

Nice addition. I think that normal glob rules provided by fnmatch would be easiest to understand for most users, i.e. '*', '?', '[0-9]', etc. It would be nice to have these available not just for ensemble members, but also for dataset name, start_year, end_year, ...

That would be nice, but they are two different kinds of animal:

  • This pull request aims to define multiple datasets with a single definition, but in a strict way: if any of them is missing, the recipe will fail.

  • fnmatch would be more like "find anything matching this pattern and run it". It would be harder, for example, to run using only the first 15 members and ignore the rest. It is also harder to implement, as we would need to check which of all the possible combinations are actually available.

What we can do to avoid confusion is change the syntax so that the two paradigms do not clash.
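
A minimal sketch of the difference between the two paradigms, using invented data in which member 7 is missing. The variable names are hypothetical and this is not how the recipe checks are implemented.

```python
import fnmatch

# Hypothetical data: members 1..20 exist except member 7.
available = {f"r{i}i1p1f1" for i in range(1, 21)} - {"r7i1p1f1"}

# Strict paradigm: the first 15 members are requested explicitly,
# so a missing member is detected and can be reported as an error.
requested = [f"r{i}i1p1f1" for i in range(1, 16)]
missing = [member for member in requested if member not in available]
print(missing)  # ['r7i1p1f1']

# Glob paradigm: selecting "the first 15" needs two patterns, and the
# missing member is silently skipped instead of raising an error.
patterns = ["r[1-9]i1p1f1", "r1[0-5]i1p1f1"]
matched = sorted(
    member
    for member in available
    if any(fnmatch.fnmatchcase(member, pattern) for pattern in patterns)
)
print(len(matched))  # 14
```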

@bouweandela

Member

find anything matching this pattern and run it.

This is a feature that several users have asked me about. I agree that it is not trivial to implement.

What we can do to avoid confusion is change the syntax so that the two paradigms do not clash.

Yes, I think it's important to think about this. Of course, we also have the --skip-nonexistent option available.

@jvegreg jvegreg added the enhancement (New feature or request) label Aug 30, 2019
@jvegreg jvegreg mentioned this pull request Sep 17, 2019
@bouweandela

Member

What we can do to avoid confusion is change the syntax so that the two paradigms do not clash.

I think the square brackets are not good in that case, because those can also be used in globs. Maybe use (1:30) instead?
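
To make the clash concrete, this sketch shows how Python's fnmatch would read the original square-bracket syntax if glob patterns were supported as well. It only demonstrates standard-library behaviour; the member names are illustrative.

```python
import fnmatch

pattern = "r[1:30]i1p1f1"

# As a glob, '[1:30]' is a character class: exactly one of '1', ':', '3', '0'.
# It is not read as the range 1 to 30.
candidates = ["r1i1p1f1", "r3i1p1f1", "r2i1p1f1", "r30i1p1f1"]
results = {
    member: fnmatch.fnmatchcase(member, pattern) for member in candidates
}
for member, is_match in results.items():
    print(member, is_match)

# Parentheses have no special meaning in fnmatch, so '(1:30)' is matched
# literally and cannot be mistaken for a glob character class.
literal = fnmatch.fnmatchcase("r(1:30)i1p1f1", "r(1:30)i1p1f1")
print(literal)  # True
```

With the bracket form, r1 and r3 match while r2 and r30 do not, which is the opposite of what a user writing a 1-to-30 range would expect.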

jvegreg commented Sep 20, 2019

Contributor Author

Good idea. Already implemented.

@bouweandela bouweandela left a comment

Member

Could you also add documentation? This might be a good place.

@mattiarighi mattiarighi merged commit 4d7e2c5 into development Oct 2, 2019
@mattiarighi mattiarighi deleted the ensemble_expansion branch October 2, 2019 09:10